---
title: "Reference: ElevenLabs | Voice"
description: "Documentation for the ElevenLabs voice implementation, offering high-quality text-to-speech capabilities with multiple voice models and natural-sounding synthesis."
---

# ElevenLabs

The ElevenLabs voice implementation in Mastra provides high-quality text-to-speech (TTS) and speech-to-text (STT) capabilities using the ElevenLabs API.

## Usage Example

```typescript
import { ElevenLabsVoice } from "@mastra/voice-elevenlabs";

// Initialize with default configuration (uses ELEVENLABS_API_KEY environment variable)
const voice = new ElevenLabsVoice();

// Initialize with custom configuration
const voice = new ElevenLabsVoice({
  speechModel: {
    name: "eleven_multilingual_v2",
    apiKey: "your-api-key",
  },
  speaker: "custom-speaker-id",
});

// Text-to-Speech
const audioStream = await voice.speak("Hello, world!");

// Get available speakers
const speakers = await voice.getSpeakers();
```

## Constructor Parameters

<PropertiesTable
  content={[
    {
      name: "speechModel",
      type: "ElevenLabsVoiceConfig",
      description: "Configuration for text-to-speech functionality.",
      isOptional: true,
      defaultValue: "{ name: 'eleven_multilingual_v2' }",
    },
    {
      name: "speaker",
      type: "string",
      description: "ID of the speaker to use for text-to-speech",
      isOptional: true,
      defaultValue: "'9BWtsMINqrJLrRacOk9x' (Aria voice)",
    },
  ]}
/>

### ElevenLabsVoiceConfig

<PropertiesTable
  content={[
    {
      name: "name",
      type: "ElevenLabsModel",
      description: "The ElevenLabs model to use",
      isOptional: true,
      defaultValue: "'eleven_multilingual_v2'",
    },
    {
      name: "apiKey",
      type: "string",
      description:
        "ElevenLabs API key. Falls back to ELEVENLABS_API_KEY environment variable",
      isOptional: true,
    },
  ]}
/>

## Methods

### speak()

Converts text to speech using the configured speech model and voice.

<PropertiesTable
  content={[
    {
      name: "input",
      type: "string | NodeJS.ReadableStream",
      description:
        "Text to convert to speech. If a stream is provided, it will be converted to text first.",
      isOptional: false,
    },
    {
      name: "options",
      type: "object",
      description: "Additional options for speech synthesis",
      isOptional: true,
    },
    {
      name: "options.speaker",
      type: "string",
      description: "Override the default speaker ID for this request",
      isOptional: true,
    },
  ]}
/>

Returns: `Promise<NodeJS.ReadableStream>`

### getSpeakers()

Returns an array of available voice options, where each node contains:

<PropertiesTable
  content={[
    {
      name: "voiceId",
      type: "string",
      description: "Unique identifier for the voice",
      isOptional: false,
    },
    {
      name: "name",
      type: "string",
      description: "Display name of the voice",
      isOptional: false,
    },
    {
      name: "language",
      type: "string",
      description: "Language code for the voice",
      isOptional: false,
    },
    {
      name: "gender",
      type: "string",
      description: "Gender of the voice",
      isOptional: false,
    },
  ]}
/>

### listen()

Converts audio input to text using ElevenLabs Speech-to-Text API.

<PropertiesTable
  content={[
    {
      name: "input",
      type: "NodeJS.ReadableStream",
      description: "A readable stream containing the audio data to transcribe",
      isOptional: false,
    },
    {
      name: "options",
      type: "object",
      description: "Configuration options for the transcription",
      isOptional: true,
    },
  ]}
/>

The options object supports the following properties:

<PropertiesTable
  content={[
    {
      name: "language_code",
      type: "string",
      description: "ISO language code (e.g., 'en', 'fr', 'es')",
      isOptional: true,
    },
    {
      name: "tag_audio_events",
      type: "boolean",
      description: "Whether to tag audio events like [MUSIC], [LAUGHTER], etc.",
      isOptional: true,
    },
    {
      name: "num_speakers",
      type: "number",
      description: "Number of speakers to detect in the audio",
      isOptional: true,
    },
    {
      name: "filetype",
      type: "string",
      description: "Audio file format (e.g., 'mp3', 'wav', 'ogg')",
      isOptional: true,
    },
    {
      name: "timeoutInSeconds",
      type: "number",
      description: "Request timeout in seconds",
      isOptional: true,
    },
    {
      name: "maxRetries",
      type: "number",
      description: "Maximum number of retry attempts",
      isOptional: true,
    },
    {
      name: "abortSignal",
      type: "AbortSignal",
      description: "Signal to abort the request",
      isOptional: true,
    },
  ]}
/>

Returns: `Promise<string>` - A Promise that resolves to the transcribed text

## Important Notes

1. An ElevenLabs API key is required. Set it via the `ELEVENLABS_API_KEY` environment variable or pass it in the constructor.
2. The default speaker is set to Aria (ID: '9BWtsMINqrJLrRacOk9x').
3. Speech-to-text functionality is not supported by ElevenLabs.
4. Available speakers can be retrieved using the `getSpeakers()` method, which returns detailed information about each voice including language and gender.
